Ingham County
Voxel-based 3D Detection and Reconstruction of Multiple Objects from a Single Image
Inferring 3D locations and shapes of multiple objects from a single 2D image is a long-standing objective of computer vision. Most existing works either predict only one of these 3D properties or solve both for a single object. One fundamental challenge lies in how to learn an effective representation of the image that is well-suited for 3D detection and reconstruction. In this work, we propose to learn a regular grid of 3D voxel features from the input image, aligned with 3D scene space via a 3D feature lifting operator. Based on the 3D voxel features, our novel CenterNet-3D detection head formulates 3D detection as keypoint detection in 3D space. Moreover, we devise an efficient coarse-to-fine reconstruction module, including coarse-level voxelization and a novel local PCA-SDF shape representation, which enables fine detail reconstruction and one order of magnitude faster inference than prior methods. With complementary supervision from both 3D detection and reconstruction, the 3D voxel features become geometry- and context-preserving, benefiting both tasks. The effectiveness of our approach is demonstrated through 3D detection and reconstruction in single-object and multiple-object scenarios. Code is available at http://cvlab.cse.
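As a rough illustration of what "keypoint detection in 3D space" can look like, the sketch below extracts local maxima from a voxel heatmap, by analogy with the peak extraction commonly used in 2D keypoint detectors. It is not the paper's CenterNet-3D head: the function name `find_peaks_3d`, the 3x3x3 neighbourhood, and the score threshold are all assumptions made for illustration.

```python
from itertools import product


def find_peaks_3d(heatmap, threshold=0.5):
    """Return ((x, y, z), score) for voxels that are local maxima above threshold.

    heatmap is a nested list indexed as heatmap[x][y][z] (hypothetical layout).
    A voxel counts as a peak if no voxel in its 3x3x3 neighbourhood scores higher.
    """
    size_x, size_y, size_z = len(heatmap), len(heatmap[0]), len(heatmap[0][0])
    peaks = []
    for x, y, z in product(range(size_x), range(size_y), range(size_z)):
        score = heatmap[x][y][z]
        if score < threshold:
            continue
        is_peak = True
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            nx, ny, nz = x + dx, y + dy, z + dz
            inside = 0 <= nx < size_x and 0 <= ny < size_y and 0 <= nz < size_z
            if inside and heatmap[nx][ny][nz] > score:
                is_peak = False
                break
        if is_peak:
            peaks.append(((x, y, z), score))
    # Strongest candidates first.
    return sorted(peaks, key=lambda item: -item[1])


# Toy 4x4x4 heatmap with two object centres and one suppressed neighbour.
grid = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
grid[1][1][1] = 0.9
grid[1][1][2] = 0.6  # adjacent to the stronger voxel, so it is not a peak
grid[3][3][3] = 0.7
centres = find_peaks_3d(grid)
```

Each surviving peak would then be treated as a candidate object centre in the voxel grid; regressing size, orientation, or shape at that location is outside the scope of this sketch.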
Design, Dynamic Modeling and Control of a 2-DOF Robotic Wrist Actuated by Twisted and Coiled Actuators
Zhang, Yunsong, Zhou, Xinyu, Zhang, Feitian
Robotic wrists play a pivotal role in the functionality of industrial manipulators and humanoid robots, facilitating manipulation and grasping tasks. In recent years, there has been growing interest in integrating artificial muscle-driven actuators into robotic wrists, motivated by their high energy density, lightweight construction, and compact design. However, in the study of robotic wrists driven by artificial muscles, dynamic model-based controllers are often overlooked, despite their critical importance for motion analysis and dynamic control of robots. This paper presents a novel design of a two-degree-of-freedom (2-DOF) robotic wrist driven by twisted and coiled actuators (TCA) utilizing a parallel mechanism with a 3RRRR configuration. The proposed robotic wrist is expected to feature lightweight structures and superior motion performance while mitigating friction issues. The Lagrangian dynamic model of the wrist is established, along with a nonlinear model predictive controller (NMPC) designed for trajectory tracking tasks. A prototype of the robotic wrist is developed, and extensive experiments are conducted to validate its motion performance and the proposed dynamic model. Subsequently, comparative experiments between the NMPC and a PID controller are conducted under various operating conditions. The experimental results demonstrate the effectiveness and robustness of the dynamic model-based controller in the motion control of TCA-driven robotic wrists.
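For readers unfamiliar with the PID baseline mentioned above, here is a minimal discrete-time PID sketch. It is a textbook form, not the authors' controller: the gains, the time step, and the first-order toy plant (`dx/dt = -x + u`) are assumptions chosen only to make the example runnable, and they do not model the TCA-driven wrist.

```python
class PID:
    """Textbook discrete PID controller (gains and dt are illustrative)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        # No derivative term on the first call, since there is no previous error yet.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_toy_plant(controller, setpoint, steps):
    """Forward-Euler simulation of the hypothetical plant dx/dt = -x + u."""
    x = 0.0
    for _ in range(steps):
        u = controller.update(setpoint, x)
        x += controller.dt * (-x + u)
    return x


pid = PID(kp=2.0, ki=1.0, kd=0.0, dt=0.01)
final_state = simulate_toy_plant(pid, setpoint=1.0, steps=2000)
```

Unlike the NMPC described in the abstract, this controller uses no model of the system dynamics, which is why it serves as a natural comparison point.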
Rapidly Built Medical Crash Cart! Lessons Learned and Impacts on High-Stakes Team Collaboration in the Emergency Room
Taylor, Angelique, Tanjim, Tauhid, Sack, Michael Joseph, Hirsch, Maia, Cheng, Kexin, Ching, Kevin, George, Jonathan St., Roumen, Thijs, Jung, Malte F., Lee, Hee Rin
Designing robots to support high-stakes teamwork in emergency settings presents unique challenges, including seamless integration into fast-paced environments, facilitating effective communication among team members, and adapting to rapidly changing situations. While teleoperated robots have been successfully used in high-stakes domains such as firefighting and space exploration, autonomous robots that aid high-stakes teamwork remain underexplored. To address this gap, we conducted a rapid prototyping process to develop a series of seemingly autonomous robots designed to assist clinical teams in the Emergency Room. We transformed a standard crash cart--which stores medical equipment and emergency supplies--into a medical crash cart robot (MCCR). We evaluated the MCCR through field deployments to assess its impact on team workload and usability, identified taxonomies of failure, and refined the MCCR in collaboration with healthcare professionals. By publicly disseminating our MCCR tutorial, we hope to encourage HRI researchers to explore the design of robots for high-stakes teamwork.
Teleoperated robots have become indispensable tools for action teams--highly skilled specialist teams that collaborate in short, high-pressure events, requiring improvisation in unpredictable situations [1]. For example, disaster response teams rely on teleoperated robots and drones to aid search and rescue operations [2], [3]. High-stakes military and SWAT teams use teleoperated ordnance disposal [4] and surveillance robots [5] to keep the teams safe. Surgical teams employ teleoperated robots to perform keyhole surgeries with a level of precision that would be unimaginable without these machines [6], [7].
We built three teleoperated medical crash cart robots (MCCRs). MCCR 1 delivers supplies using a hoverboard circuit. MCCR 2 delivers supplies, recommends supplies using drawer-opening capabilities, and was deployed at a medical training event, which revealed insights.
Co-Designing Augmented Reality Tools for High-Stakes Clinical Teamwork
Taylor, Angelique, Tanjim, Tauhid, Cao, Huajie, Nicoly, Jalynn Blu, Segal, Jonathan I., George, Jonathan St., Kim, Soyon, Ching, Kevin, Ortega, Francisco R., Lee, Hee Rin
How might healthcare workers (HCWs) leverage augmented reality head-mounted displays (AR-HMDs) to enhance teamwork? Although AR-HMDs have shown immense promise in supporting teamwork in healthcare settings, design for Emergency Department (ER) teams has received little attention. The ER presents unique challenges, including procedural recall, medical errors, and communication gaps. To address this gap, we engaged in a participatory design study with healthcare workers to gain a deep understanding of the potential for AR-HMDs to facilitate teamwork during ER procedures. Our results reveal that AR-HMDs can be used as an information-sharing and information-retrieval system to bridge knowledge gaps, and they surface concerns about integrating AR-HMDs into ER workflows. We contribute design recommendations for seven role-based AR-HMD application scenarios involving HCWs with various expertise, working across multiple medical tasks. We hope our research inspires designers to embark on the development of new AR-HMD applications for high-stakes team environments.
Cluster and Aggregate: Face Recognition with Large Probe Set Supplementary Material
To train the fusion network F, which comprises SIM, CN and AGN, we set the batch size to 512. We take the pretrained model E, which is IResNet-101 [2] trained on WebFace4M [15] with ArcFace loss [2], and freeze it without further tuning. For training CAFace, the number of images per identity N is randomly chosen between 2 and 16 at each training step, and we take two sets per identity. The intermediate features for the Style Input Component (SIM) are taken from blocks 3 and 4 of the IResNet-101. The number of clusters in CN is varied in the ablation studies and fixed to 4 for subsequent experiments.
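The per-step sampling described above can be sketched as follows. This is only one reading of the text: the helper name `sample_step`, the `images_by_identity` mapping, and sampling with replacement are assumptions, and the text does not say whether the batch size of 512 counts identities or sets. Only the constants (512, 2 to 16, two sets, 4 clusters) come from the source.

```python
import random

BATCH_SIZE = 512         # from the text; what it counts is not specified
N_MIN, N_MAX = 2, 16     # range for the number of images per identity N
SETS_PER_IDENTITY = 2    # two sets per identity
NUM_CLUSTERS = 4         # fixed number of clusters in CN after the ablations


def sample_step(images_by_identity, identities, rng):
    """Draw N once for the step, then build two sets of N images per identity."""
    n = rng.randint(N_MIN, N_MAX)  # randint is inclusive on both ends
    batch = []
    for identity in identities:
        pool = images_by_identity[identity]
        for _ in range(SETS_PER_IDENTITY):
            # Assumption: sample with replacement so small pools still yield N images.
            batch.append((identity, [rng.choice(pool) for _ in range(n)]))
    return n, batch


# Toy usage with four hypothetical identities of 20 image ids each.
toy_data = {identity: list(range(20)) for identity in range(4)}
n_images, toy_batch = sample_step(toy_data, list(toy_data), random.Random(0))
```

Drawing N once per step keeps every set in a batch the same length, which makes stacking the sets into a single tensor straightforward.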
Social media polarization during conflict: Insights from an ideological stance dataset on Israel-Palestine Reddit comments
Ali, Hasin Jawad, Abrar, Ajwad, Hossain, S. M. Hozaifa, Mridha, M. Firoz
In politically sensitive scenarios like wars, social media serves as a platform for polarized discourse and expressions of strong ideological stances. While prior studies have explored ideological stance detection in general contexts, limited attention has been given to conflict-specific settings. This study addresses this gap by analyzing 9,969 Reddit comments related to the Israel-Palestine conflict, collected between October 2023 and August 2024. The comments were categorized into three stance classes: Pro-Israel, Pro-Palestine, and Neutral. Various approaches, including machine learning, pre-trained language models, neural networks, and prompt engineering strategies for open source large language models (LLMs), were employed to classify these stances. Performance was assessed using metrics such as accuracy, precision, recall, and F1-score. Among the tested methods, the Scoring and Reflective Re-read prompt in Mixtral 8x7B demonstrated the highest performance across all metrics. This study provides comparative insights into the effectiveness of different models for detecting ideological stances in highly polarized social media contexts. The dataset used in this research is publicly available for further exploration and validation.
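The metrics named above can be computed for the three stance classes as in the sketch below. The abstract does not say how per-class scores are averaged, so macro-averaging is an assumption here, and the example labels and predictions are made up for illustration rather than taken from the dataset.

```python
LABELS = ["Pro-Israel", "Pro-Palestine", "Neutral"]


def stance_metrics(y_true, y_pred, labels=LABELS):
    """Accuracy plus macro-averaged precision, recall, and F1 (averaging is assumed)."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
        "f1": sum(f1s) / len(labels),
    }


# Made-up example: one Pro-Palestine comment is misread as Neutral.
truth = ["Pro-Israel", "Pro-Palestine", "Neutral", "Neutral"]
preds = ["Pro-Israel", "Neutral", "Neutral", "Neutral"]
scores = stance_metrics(truth, preds)
```

In this toy case accuracy is 0.75 while macro F1 drops to 0.6, because the missed Pro-Palestine class contributes an F1 of zero; this is why macro-averaged scores are informative when classes are imbalanced.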
Propeller Motion of a Devil-Stick using Normal Forcing
Khandelwal, Aakash, Mukherjee, Ranjan
The problem of realizing rotary propeller motion of a devil-stick in the vertical plane using forces purely normal to the stick is considered. This problem represents a nonprehensile manipulation task of an underactuated system. In contrast with previous approaches, the devil-stick is manipulated by controlling the normal force and its point of application. Virtual holonomic constraints are used to design the trajectory of the center-of-mass of the devil-stick in terms of its orientation angle, and conditions for stable propeller motion are derived. Intermittent large-amplitude forces are used to asymptotically stabilize a desired propeller motion. Simulations demonstrate the efficacy of the approach in realizing stable propeller motion without loss of contact between the actuator and devil-stick.
Leveraging Large Language Models to Analyze Emotional and Contextual Drivers of Teen Substance Use in Online Discussions
Zhu, Jianfeng, Jin, Ruoming, Jiang, Hailong, Wang, Yulan, Zhang, Xinyu, Coifman, Karin G.
Adolescence is a critical stage often linked to risky behaviors, including substance use, with significant developmental and public health implications. Social media provides a lens into adolescent self-expression, but interpreting emotional and contextual signals remains complex. This study applies Large Language Models (LLMs) to analyze adolescents' social media posts, uncovering emotional patterns (e.g., sadness, guilt, fear, joy) and contextual factors (e.g., family, peers, school) related to substance use. Heatmap and machine learning analyses identified key predictors of substance use-related posts. Negative emotions like sadness and guilt were significantly more frequent in substance use contexts, with guilt acting as a protective factor, while shame and peer influence heightened substance use risk. Joy was more common in non-substance use discussions. Peer influence correlated strongly with sadness, fear, and disgust, while family and school environments aligned with non-substance use. Findings underscore the importance of addressing emotional vulnerabilities and contextual influences, suggesting that collaborative interventions involving families, schools, and communities can reduce risk factors and foster healthier adolescent development.
Sparse Modelling for Feature Learning in High Dimensional Data
Neelam, Harish, Veerella, Koushik Sai, Biswas, Souradip
This paper presents an innovative approach to dimensionality reduction and feature extraction in high-dimensional datasets, with a specific application focus on wood surface defect detection. The proposed framework integrates sparse modeling techniques, particularly Lasso and proximal gradient methods, into a comprehensive pipeline for efficient and interpretable feature selection. Leveraging pre-trained models such as VGG19 and incorporating anomaly detection methods like Isolation Forest and Local Outlier Factor, our methodology addresses the challenge of extracting meaningful features from complex datasets. Evaluation metrics such as accuracy and F1 score, alongside visualizations, are employed to assess the performance of the sparse modeling techniques. Through this work, we aim to advance the understanding and application of sparse modeling in machine learning, particularly in the context of wood surface defect detection.
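To make the combination of Lasso and proximal gradient methods concrete, the following sketch implements ISTA (iterative soft-thresholding), a standard proximal gradient method for the Lasso objective (1/2n)·||Xw − y||² + λ·||w||₁. It is a generic textbook version, not the authors' pipeline: the function names, the conservative step size based on the Frobenius norm, and the toy data are assumptions.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero by threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_ista(X, y, lam, n_iter=500):
    """Minimise (1/2n)||Xw - y||^2 + lam * ||w||_1 by proximal gradient descent."""
    n, d = len(X), len(X[0])
    # ||X||_F^2 / n upper-bounds the Lipschitz constant of the gradient,
    # so this step size is safe (if conservative).
    step = n / sum(x * x for row in X for x in row)
    w = [0.0] * d
    for _ in range(n_iter):
        residual = [sum(xi * wi for xi, wi in zip(row, w)) - yi for row, yi in zip(X, y)]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) / n for j in range(d)]
        # Gradient step on the smooth part, then the L1 proximal step.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad)]
    return w


# Toy data: the target depends only on the first feature (y = 2 * x0).
X_toy = [[1.0, 0.1], [2.0, -0.1], [3.0, 0.05], [4.0, -0.05]]
y_toy = [2.0, 4.0, 6.0, 8.0]
weights = lasso_ista(X_toy, y_toy, lam=0.1)
```

The irrelevant second feature receives a weight of exactly zero, which is the feature-selection behaviour that makes sparse models interpretable; the first weight lands slightly below 2 because the L1 penalty shrinks it.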
Transient Adversarial 3D Projection Attacks on Object Detection in Autonomous Driving
Zhou, Ce, Yan, Qiben, Liu, Sijia
Object detection is a crucial task in autonomous driving. While existing research has proposed various attacks on object detection, such as those using adversarial patches or stickers, projection attacks on 3D surfaces remain largely unexplored. Compared to adversarial patches or stickers, which have fixed adversarial patterns, projection attacks allow for transient modifications to these patterns, enabling a more flexible attack. In this paper, we introduce an adversarial 3D projection attack specifically targeting object detection in autonomous driving scenarios. We formulate the attack as an optimization problem, utilizing a combination of color mapping and geometric transformation models. Our results demonstrate the effectiveness of the proposed attack in deceiving YOLOv3 and Mask R-CNN in physical settings. Evaluations conducted in an indoor environment show an attack success rate of up to 100% under low ambient light conditions, highlighting the potential damage of our attack in real-world driving scenarios.